1 Overview

Quick guide

1.1 HeatMap

1.2 Name Mapping

Mapping of Raw files to their short names
Mapping source: automatic (automatic shortening of names was not sufficient - see ‘best effort’)
from to best.effort
80_10a file 001 80_10a
80_10b file 002 80_10b
80_10c file 003 80_10c
80_10d file 004 80_10d
80_12a file 005 80_12a
80_12c file 006 80_12c
80_12d file 007 80_12d
80_14a file 008 80_14a
80_18a file 009 80_18a
80_18b file 010 80_18b
80_18d file 011 80_18d
80_20a file 012 80_20a
80_20b file 013 80_20b
80_20c file 014 80_20c
80_20d file 015 80_20d
80_7a file 016 80_7a
80_7b file 017 80_7b
80_7c file 018 80_7c
80_7d file 019 80_7d
C_18b file 020 C_18b
C_18c file 021 C_18c
C_7b file 022 C_7b
2_10b file 023 2_10b
2_10d file 024 2_10d
2_12a file 025 2_12a
2_12b file 026 2_12b
2_12c file 027 2_12c
2_12d file 028 2_12d
2_18a file 029 2_18a
2_18b file 030 2_18b
2_18d file 031 2_18d
2_20a file 032 2_20a
2_20b file 033 2_20b
2_20c file 034 2_20c
2_20d file 035 2_20d
2_7a file 036 2_7a
2_7b file 037 2_7b
2_7c file 038 2_7c
tkQE171103_JL601_Control_1_10_02 file 039 tkQE171103_JL601_Control_1_10_02
tkQE171103_JL601_Control_1_12 file 040 tkQE171103_JL601_Control_1_12
tkQE171103_JL601_Control_1_18 file 041 tkQE171103_JL601_Control_1_18
tkQE171103_JL601_Control_2_12 file 042 tkQE171103_JL601_Control_2_12
tkQE171103_JL601_Control_2_18 file 043 tkQE171103_JL601_Control_2_18
tkQE171103_JL601_Control_3_18_02 file 044 tkQE171103_JL601_Control_3_18_02
tkQE171103_JL601_NCBP3_1_10_02 file 045 tkQE171103_JL601_NCBP3_1_10_02
tkQE171103_JL601_NCBP3_1_12 file 046 tkQE171103_JL601_NCBP3_1_12
tkQE171103_JL601_NCBP3_1_18 file 047 tkQE171103_JL601_NCBP3_1_18
tkQE171103_JL601_NCBP3_2_18 file 048 tkQE171103_JL601_NCBP3_2_18
tkQE171103_JL601_NCBP3_2_7_02 file 049 tkQE171103_JL601_NCBP3_2_7_02
tkQE171103_JL601_NCBP3_3_10 file 050 tkQE171103_JL601_NCBP3_3_10
tkQE171103_JL601_NCBP3_3_12 file 051 tkQE171103_JL601_NCBP3_3_12
tkQE171103_JL601_NCBP3_3_18_02 file 052 tkQE171103_JL601_NCBP3_3_18_02
tkQE171103_JL601_NCBP3_3_7_02 file 053 tkQE171103_JL601_NCBP3_3_7_02
2C_12a file 054 2C_12a
tkQE171103_JL601_Control_1_7 file 055 tkQE171103_JL601_Control_1_7
tkQE171103_JL601_Control_2_10 file 056 tkQE171103_JL601_Control_2_10
tkQE171103_JL601_NCBP3_1_7 file 057 tkQE171103_JL601_NCBP3_1_7
tkQE171103_JL601_NCBP3_2_10 file 058 tkQE171103_JL601_NCBP3_2_10
80_14d file 059 80_14d
C_14d file 060 C_14d
C_18a file 061 C_18a
2C_18a file 062 2C_18a
tkQE171103_JL601_Control_3_10 file 063 tkQE171103_JL601_Control_3_10
tkQE171103_JL601_Control_3_12 file 064 tkQE171103_JL601_Control_3_12
80_14b file 065 80_14b
C_20a file 066 C_20a
tkQE171103_JL601_NCBP3_2_12 file 067 tkQE171103_JL601_NCBP3_2_12
2C_12b file 068 2C_12b
2C_12d file 069 2C_12d
2C_14a file 070 2C_14a
2C_14b file 071 2C_14b
2C_18c file 072 2C_18c
2C_20a file 073 2C_20a
2C_20b file 074 2C_20b
2C_20c file 075 2C_20c
2C_20d file 076 2C_20d
2C_7c file 077 2C_7c
2C_7d file 078 2C_7d
C_10c file 079 C_10c
C_10d file 080 C_10d
C_20d file 081 C_20d
C_7d file 082 C_7d
80_14c file 083 80_14c
C_14a file 084 C_14a
C_14c file 085 C_14c
C_20c file 086 C_20c
C_10b file 087 C_10b
C_12a file 088 C_12a
C_14b file 089 C_14b
C_7a file 090 C_7a
C_20b file 091 C_20b
80_18c file 092 80_18c
C_12c file 093 C_12c
C_18d file 094 C_18d
C_7c file 095 C_7c
2C_12c file 096 2C_12c
2C_14c file 097 2C_14c
2C_18d file 098 2C_18d
2C_7a file 099 2C_7a
2_14a file 100 2_14a
2_14b file 101 2_14b
2_14c file 102 2_14c
2_14d file 103 2_14d
2C_10b file 104 2C_10b
tkQE171103_JL601_Control_2_7_02 file 105 tkQE171103_JL601_Control_2_7_02
tkQE171103_JL601_Control_3_7_02 file 106 tkQE171103_JL601_Control_3_7_02
2C_14d file 107 2C_14d
2_10a file 108 2_10a
C_12d file 109 C_12d
2C_18b file 110 2C_18b
C_10a file 111 C_10a
2_10c file 112 2_10c
2C_10a file 113 2C_10a
2C_10c file 114 2C_10c
2C_10d file 115 2C_10d
2C_7b file 116 2C_7b

1.3 Metrics

1.3.1 PG: PCA of ‘raw intensity’


(excludes contaminants)

Principal components plots of experimental groups (as defined during MaxQuant configuration).

This plot is shown only if more than one experimental group was defined. If LFQ was activated in MaxQuant, an additional PCA plot for LFQ intensities is shown. The same applies if iTRAQ/TMT reporter intensities are detected.

Since experimental groups and Raw files do not necessarily correspond 1:1, this plot cannot use the abbreviated Raw file names, but instead must rely on automatic shortening of group names.

Heatmap score: none (since data source proteinGroups.txt is not related 1:1 to Raw files)


1.3.2 PG: PCA of ‘lfq intensity’


(excludes contaminants)


1.3.3 EVD: Top5 Contaminants per Raw file

PTXQC will explicitly show the five most abundant external protein contaminants (as detected via MaxQuant’s contaminants FASTA file) by Raw file, and summarize the remaining contaminants as ‘other’. This allows you to track down exactly which proteins contaminate your sample. Lower contamination is better. The ‘Abundance class’ models the average peptide intensity in each Raw file and is visualized using varying degrees of transparency. It is not unusual for samples with low sample content to show higher contamination. If you see only one abundance class (‘mid’), this means all your Raw files have roughly the same peptide intensity distribution.

Heatmap score [EVD: Contaminants]: as fraction of summed intensity with 0 = sample full of contaminants; 1 = no contaminants
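
As a rough illustration of the heatmap score above, the following Python sketch computes one minus the contaminant share of the summed intensity for a single Raw file. The function name and the example intensities are hypothetical and are not taken from PTXQC.

```python
def contaminant_score(contaminant_intensity, total_intensity):
    """Score in [0, 1]: 1 = no contaminants, 0 = sample full of contaminants."""
    if total_intensity <= 0:
        raise ValueError("total intensity must be positive")
    return 1.0 - contaminant_intensity / total_intensity


# Hypothetical summed intensities for one Raw file
# (the contaminant intensity is part of the total).
score = contaminant_score(contaminant_intensity=2.0e9, total_intensity=4.0e10)
print(f"{score:.2f}")
```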




1.3.4 EVD: Contaminants

User-defined contaminant plot based on peptide intensities and counts. Usually used for Mycoplasma detection, but can be used for an arbitrary protein or set of proteins.

All proteins (and their peptides) which contain the search string from the YAML file are considered contaminants. The contaminant’s search string is searched in the full FASTA header in proteinGroups.txt. If proteinGroups.txt is not available/found, only protein identifiers can be considered. The search realm used is given in the plot subtitle. You should choose the contaminant name to be distinctive. Only peptides belonging to a single protein group are considered when computing the fractions (contaminant vs. all), since peptides shared across multiple groups are potentially false positives.

Two abundance measures are computed per Raw file:

  • fraction of contaminant intensity (used for scoring of the metric)
  • fraction of contaminant spectral counts (as comparison; both should be similar)

If the intensity fraction exceeds the threshold (indicated by the dashed horizontal line), a contamination is assumed.

For each Raw file exceeding the threshold, an additional plot giving cumulative Andromeda peptide score distributions is shown. This helps to decide whether the contamination is real. Contaminant scores should be equally high (or higher), i.e. to the right, compared to the sample scores. Each graph’s subtitle is augmented with a p-value of the Kolmogorov-Smirnov test of this data (Andromeda scores of contaminant peptides vs. sample peptides). If the p-value is high, there is no score difference between the two peptide populations. In particular, the contaminant peptides are not bad-scoring, random hits. These p-values are also shown in the first figure for each Raw file. Note that the p-value is purely based on Andromeda scores and is independent of intensity or spectral counts.

Heatmap score [EVD: Contaminant ]: boolean score, i.e. 0% (fail) if the intensity threshold was exceeded; otherwise 100% (pass).
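
The two abundance measures and the boolean score described in this section can be sketched as follows. This is a minimal illustration, not PTXQC's implementation: the `Peptide` record, the field names, the example values and the threshold of 0.01 are all assumptions, and each peptide entry is counted as one spectrum.

```python
from dataclasses import dataclass


@dataclass
class Peptide:
    intensity: float
    n_protein_groups: int   # number of protein groups this peptide maps to
    is_contaminant: bool    # FASTA header contained the search string


def contaminant_fractions(peptides):
    """Return (intensity fraction, count fraction) of contaminant peptides."""
    # Peptides shared across several protein groups are ignored.
    unique = [p for p in peptides if p.n_protein_groups == 1]
    if not unique:
        raise ValueError("no peptides with a single protein group")
    contaminants = [p for p in unique if p.is_contaminant]
    total_intensity = sum(p.intensity for p in unique)
    intensity_fraction = sum(p.intensity for p in contaminants) / total_intensity
    count_fraction = len(contaminants) / len(unique)
    return intensity_fraction, count_fraction


def heatmap_score(intensity_fraction, threshold):
    """Boolean score: 0.0 (fail) if the threshold is exceeded, otherwise 1.0."""
    return 0.0 if intensity_fraction > threshold else 1.0


peptides = [
    Peptide(100.0, 1, False),
    Peptide(300.0, 1, False),
    Peptide(100.0, 1, True),
    Peptide(500.0, 2, True),   # shared peptide, excluded from both fractions
]
int_frac, count_frac = contaminant_fractions(peptides)
score = heatmap_score(int_frac, threshold=0.01)  # threshold value is an assumption
```

In this toy data the intensity fraction (0.2) and the count fraction (1/3) are of similar magnitude, which is the comparison the two measures are meant to support.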


1.3.5 EVD: peptide intensity distribution


RSD 4.5% (expected < 5%)

Peptide precursor intensity per Raw file from evidence.txt. Low peptide intensity usually goes hand in hand with low MS/MS identification rates and unfavourable signal/noise ratios, which makes signal detection harder. Also, instrument acquisition time increases for trapping instruments.

Failing to reach the intensity threshold is usually due to unfavorable column conditions, inadequate column loading or ionization issues. If the study is not a dilution series or pulsed SILAC experiment, we would expect every condition to have about the same median log-intensity (of 2^23.0). The relative standard deviation (RSD) gives an indication about reproducibility across files and should be below 5%.

Depending on your setup, your target thresholds might vary from PTXQC’s defaults. Change the threshold using the YAML configuration file.

Heatmap score [EVD: Pep Intensity (>23.0)]: Linear scale of the median intensity reaching the threshold, i.e. reaching 2^21 of 2^23 gives score 0.25.
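
A short Python sketch of the two quantities in this section: the linear intensity score (a median of 2^21 against a threshold of 2^23 gives 0.25) and the RSD across files. Function names and the example medians are hypothetical. Computing the RSD on per-file median log2 intensities with the sample standard deviation is an assumption made for illustration.

```python
import statistics


def intensity_score(median_intensity, threshold_log2=23.0):
    """Linear score: median intensity relative to the threshold, capped at 1."""
    return min(1.0, median_intensity / 2.0 ** threshold_log2)


def rsd_percent(values):
    """Relative standard deviation (sample std. dev. / mean) in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


example_score = intensity_score(2.0 ** 21)   # median of 2^21 vs. threshold 2^23
log2_medians = [22.0, 23.0, 24.0]            # hypothetical per-file medians
example_rsd = rsd_percent(log2_medians)
print(example_score, round(example_rsd, 2))
```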





1.3.6 PG: intensity distribution


RSD 5.3% (w/o zero int.; expected < 5%)
473.8% [high RSD -> few peptides]

Intensity boxplots by experimental groups. Groups are user-defined during MaxQuant configuration. This plot displays a (customizable) threshold line for the desired mean intensity of proteins. Groups which underperform here are likely to also suffer from a worse MS/MS ID rate and higher contamination due to the lack of total protein loaded/detected. If possible, all groups should show a high and consistent amount of total protein. The height of the bar correlates to the number of proteins with non-zero abundance.

Contaminants are shown as overlaid yellow boxes, whose height corresponds to the number of contaminant proteins. The position of the box gives the intensity distribution of the contaminants.

Heatmap score: none (since data source proteinGroups.txt is not related 1:1 to Raw files)





1.3.7 PG: LFQ intensity distribution


RSD 12% (w/o zero int.; expected < 5%)
1077% [high RSD -> few peptides]

Label-free quantification (LFQ) intensity boxplots by experimental groups. Groups are user-defined during MaxQuant configuration. This plot displays a (customizable) threshold line for the desired mean of LFQ intensity of proteins. Raw files which underperform in Raw intensity are likely to show an increased mean here, since only high-abundance proteins are recovered and quantifiable by MaxQuant in this Raw file. The remaining proteins are likely to receive an LFQ value of 0 (i.e. do not contribute to the distribution). The height of the bar correlates to the number of proteins with non-zero abundance.

Contaminants are shown as overlaid yellow boxes, whose height corresponds to the number of contaminant proteins. The position of the box gives the intensity distribution of the contaminants.

Heatmap score: none (since data source proteinGroups.txt is not related 1:1 to Raw files)





1.3.8 EVD: charge distribution

Charge distribution per Raw file. For tryptic digests, peptides of charge 2 (one charge at the N-terminus and one at the tryptic C-terminal R or K residue) should be dominant. Ionization issues (voltage?), in-source fragmentation, missed cleavages and buffer irregularities can cause a shift (see Bittremieux 2017, DOI: 10.1002/mas.21544). The charge distribution should be similar across Raw files. Consistent charge distribution is paramount for comparable 3D-peak intensities across samples.

Heatmap score [EVD: Charge]: Deviation of the charge 2 proportion from a representative Raw file (‘qualMedianDist’ function).
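
The exact definition of ‘qualMedianDist’ is not given in this report. The Python sketch below shows one plausible reading, assumed purely for illustration: each Raw file is scored by how far its charge-2 proportion lies from the median proportion across all files. The function name and the example proportions are hypothetical.

```python
import statistics


def median_distance_scores(charge2_fractions):
    """Score each Raw file by 1 - |fraction - median fraction| (assumed form)."""
    reference = statistics.median(charge2_fractions)
    return [1.0 - abs(f - reference) for f in charge2_fractions]


# Hypothetical charge-2 proportions for three Raw files.
scores = median_distance_scores([0.60, 0.62, 0.40])
print([round(s, 2) for s in scores])
```

Under this reading, the third file deviates most from the representative value and receives the lowest score.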





1.3.9 PG: Contaminant per condition

External protein contamination should be controlled for. MaxQuant therefore ships with a comprehensive, yet customizable protein contamination database, which is searched by default. PTXQC generates a contamination plot derived from the proteinGroups (PG) table showing the fraction of total protein intensity attributable to contaminants. The plot employs transparency to discern differences in the group-wise summed protein abundance. This makes it possible to distinguish high contamination in high complexity samples from high contamination in low complexity samples (e.g. from in-gel digestion). If you see only one abundance class (‘mid’), this means all your groups have roughly the same summed protein intensity. Note that this plot is based on experimental groups, and therefore may not correspond 1:1 to Raw files.

Heatmap score: none (since data source proteinGroups.txt is not related 1:1 to Raw files)


1.3.10 MSMSscans: TopN

Reaching TopN on a regular basis indicates that all sections of the LC gradient deliver a sufficient number of peptides to keep the instrument busy. This metric somewhat summarizes ‘TopN over RT’.

Heatmap score [MS2 Scans: TopN high]: rewards if TopN was reached on a regular basis (function qualHighest)














1.3.11 MSMSscans: TopN over RT

TopN over retention time. Similar to ID over RT, this metric reflects the complexity of the sample at any point in time. Ideally complexity should be made roughly equal (constant) by choosing a proper (non-linear) LC gradient. See Moruz 2014, DOI: 10.1002/pmic.201400036 for details.

Heatmap score [MS2 Scans: TopN over RT]: Rewards uniform (function Uniform) TopN events over time.


















1.3.12 EVD: IDs over RT

Judge column occupancy over retention time. Ideally, the LC gradient is chosen such that the number of identifications (here, after FDR filtering) is uniform over time, to ensure consistent instrument duty cycles. Sharp peaks and uneven distribution of identifications over time indicate potential for LC gradient optimization. See Moruz 2014, DOI: 10.1002/pmic.201400036 for details.

Heatmap score [EVD: ID rate over RT]: Scored using ‘Uniform’ scoring function, i.e. constant receives good score, extreme shapes are bad.


















1.3.13 EVD: Peak width over RT

One parameter of optimal and reproducible chromatographic separation is the distribution of widths of peptide elution peaks, derived from the evidence table. Ideally, all Raw files show a similar distribution, e.g. to allow for equal conditions during dynamic precursor exclusion, RT alignment or peptide quantification.

Heatmap score [EVD: RT Peak Width]: Scored using BestKS function, i.e. the D statistic of a Kolmogorov-Smirnov test.
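
For readers unfamiliar with the D statistic, here is a minimal pure-Python sketch of the two-sample Kolmogorov-Smirnov D, i.e. the largest vertical distance between two empirical cumulative distribution functions. How the BestKS function chooses its reference Raw file and converts D into a score is not shown here; the function name and the peak widths are hypothetical.

```python
from bisect import bisect_right


def ks_d_statistic(sample_a, sample_b):
    """Two-sample KS D: maximum distance between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The maximum distance is always attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical peak widths (in seconds) from two Raw files.
d_value = ks_d_statistic([1, 2, 3, 4], [3, 4, 5, 6])
print(d_value)
```

A D close to 0 means the two peak width distributions are nearly identical; a D of 1 means they do not overlap at all.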


















1.3.14 MSMSscans: Ion Injection Time over RT

Ion injection time score - should be as low as possible to allow fast cycles. Correlated with peptide intensity. Note that this threshold needs customization depending on the instrument used (e.g., ITMS vs. FTMS).

Heatmap score [MS2 Scans: Ion Inj Time]: Linear score as fraction of MS/MS below the threshold.
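
The linear score can be read as the share of MS/MS scans whose ion injection time stays below the threshold. A minimal Python sketch, with a hypothetical function name, hypothetical injection times and an assumed threshold of 10 ms:

```python
def injection_time_score(injection_times_ms, threshold_ms):
    """Fraction of MS/MS scans with an ion injection time below the threshold."""
    if not injection_times_ms:
        raise ValueError("no MS/MS scans given")
    below = sum(1 for t in injection_times_ms if t < threshold_ms)
    return below / len(injection_times_ms)


# Hypothetical injection times (ms) for four MS/MS scans of one Raw file.
score = injection_time_score([5.0, 8.0, 12.0, 9.0], threshold_ms=10.0)
print(score)
```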


















1.3.15 [experimental] MSMSscans: MS/MS intensity

MS/MS identifications can be ‘bad’ for a couple of reasons. It could be computational, i.e. ID rates are low because you specified the wrong protein database or modifications (not our concern here). Another reason is low/missing signals for fragment ions, e.g. due to bad (quadrupole/optics) ion transmission (charging effects), too small isolation windows, etc.

Hence, we plot the TIC and base peak intensity of all MS/MS scans (incl. unidentified ones) per Raw file. Depending on the setup, these intensities can vary, but telling apart good from bad samples should never be a problem. If you only have bad samples, you need to know the intensity a good sample would reach.

To automatically score this, we found that the TIC should be 10-100x larger than the base peak (BP), i.e. there should be many other ions which are roughly as high (a good fragmentation ladder). If there are only a few spurious peaks (bad MS/MS), the TIC is much lower relative to the base peak. Thus, we score the ratio TIC / BP: if TIC >= BP * 10, the scan receives a 100% score. If TIC <= BP * 3, we say this MS/MS failed (0%). Anything between 3x and 10x gets a score in between. The score for the Raw file is computed as the median score across all its MS/MS scans.

Heatmap score [MS2 Scans: Intensity]: Linear score (0-100%) between 3 < (TIC / BP) < 10.
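
The scoring rule above can be sketched in Python as a clamped linear interpolation of TIC / BP between 3 and 10, followed by the median over all MS/MS scans of a Raw file. Function names and the example scans are hypothetical; this is an illustration of the stated rule, not PTXQC's code.

```python
import statistics


def scan_score(tic, base_peak, low=3.0, high=10.0):
    """Linear score in [0, 1] for one MS/MS scan, based on the ratio TIC / BP."""
    ratio = tic / base_peak
    return min(1.0, max(0.0, (ratio - low) / (high - low)))


def raw_file_score(scans):
    """Median scan score over all (TIC, base peak) pairs of a Raw file."""
    return statistics.median([scan_score(tic, bp) for tic, bp in scans])


# Hypothetical (TIC, base peak) pairs: ratios of 10, 3 and 6.5.
scans = [(1000.0, 100.0), (300.0, 100.0), (650.0, 100.0)]
score = raw_file_score(scans)
print(score)
```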











1.3.16 EVD: Oversampling (MS/MS counts per 3D-peak)

An oversampled 3D-peak is defined as a peak whose peptide ion (same sequence and same charge state) was identified by at least two distinct MS2 spectra in the same Raw file. For high complexity samples, oversampling of individual 3D-peaks automatically leads to undersampling or even omission of other 3D-peaks, reducing the number of identified peptides. Oversampling occurs in low-complexity samples or long LC gradients, as well as with undersized dynamic exclusion windows in data-dependent acquisitions.

Heatmap score [EVD: MS2 Oversampling]: The percentage of non-oversampled 3D-peaks.
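
A minimal Python sketch of the definition above: identified MS2 spectra are counted per peptide ion (sequence plus charge state), and the score is the share of peptide ions identified exactly once. Treating each peptide ion as a single 3D-peak is a simplifying assumption; the function name and the example spectra are hypothetical.

```python
from collections import Counter


def non_oversampled_fraction(identified_spectra):
    """Fraction of peptide ions identified by exactly one MS2 spectrum.

    identified_spectra: one (sequence, charge) tuple per identified MS2
    spectrum of a single Raw file.
    """
    counts = Counter(identified_spectra)
    if not counts:
        raise ValueError("no identified spectra given")
    return sum(1 for n in counts.values() if n == 1) / len(counts)


spectra = [
    ("PEPTIDEK", 2), ("PEPTIDEK", 2),                    # oversampled (2 spectra)
    ("PEPTIDEK", 3),                                     # other charge state, sampled once
    ("SAMPLER", 2),                                      # sampled once
    ("ANOTHERK", 2), ("ANOTHERK", 2), ("ANOTHERK", 2),   # oversampled (3 spectra)
]
score = non_oversampled_fraction(spectra)
print(score)
```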





1.3.17 EVD: Uncalibrated mass error

Mass accuracy before calibration. Outliers are marked as such (‘out-of-search-tol’) using ID rate and standard deviation as additional information (if available). If any Raw file is flagged ‘failed’, increasing MaxQuant’s first-search tolerance (20 ppm by default, here: 20.0 ppm) might help to enable successful recalibration. A bug in MaxQuant sometimes leads to excessively high ppm mass errors (>10^4) reported in the output data. However, this can sometimes be corrected for by re-computing the delta mass error from other data. If this is the case, a warning (‘bugfix applied’) will be shown.

Heatmap score [EVD: MS Cal Pre (20.0)]: the centeredness (function CenteredRef) of uncalibrated masses in relation to the search window size.
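
For orientation, a ppm mass error is conventionally the relative deviation of the observed from the theoretical m/z, scaled by 10^6. The Python sketch below uses this standard definition; which evidence.txt columns PTXQC uses to re-compute the error is not stated here, so the function names and the example values are assumptions.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(ppm, tolerance_ppm=20.0):
    """True if the error lies inside the (first-search) tolerance window."""
    return abs(ppm) <= tolerance_ppm


# Hypothetical precursor: theoretical m/z 500.0, observed at 500.005.
error = ppm_error(500.005, 500.0)
print(round(error, 3), within_tolerance(error))
```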







1.3.18 EVD: Calibrated mass error

Precursor mass accuracy after calibration. Failed samples from precalibration data are still marked here. Ppm errors should be centered on zero and their spread is expected to be significantly smaller than before calibration.

Heatmap score [EVD: MS Cal-Post]: The variance and centeredness around zero of the calibrated distribution (function GaussDev).







1.3.19 MSMSscans: TopN % identified over N

Looking at the identification rates per scan event (i.e. the MS/MS scans after a survey scan) can give hints on how well scheduled precursor peaks could be fragmented and identified. If performance drops for the later MS/MS scans, then the LC peaks are probably not wide enough to deliver enough eluent, or the intensity threshold to trigger the MS/MS event should be lowered (if the LC peak is already over) or increased (if the LC peak is still too weak to collect enough ions).

Heatmap score [MS2 Scans: TopN ID over N]: Rewards uniform identification performance across all scan events.














1.3.20 [experimental] EVD: Non-Missing Peptides


compared to all peptides seen in experiment

Missing peptide intensities per Raw file from evidence.txt. This metric shows the fraction of missing peptides compared to all peptides seen in the whole experiment. The more Raw files you have, the higher this fraction is going to be (because there is always going to be some exotic [low intensity?] peptide which gets [falsely] identified in only a single Raw file). A second plot shows how many peptides (Y-axis) are covered by at least X Raw files. A third plot shows the density of the observed (line) and the missing (filled area) data.

To reconstruct the distribution of missing values, an imputation strategy is required, so the argument is somewhat circular here. If all Raw files are (technical) replicates, i.e. we can expect that missing peptides are indeed present and have an intensity similar to the peptides we do see, then the median is a good estimator. This method performs a global normalization across Raw files (so their observed intensity distributions have the same mean) before computing the imputed values. Afterwards, the distributions are de-normalized again (shifting them back to their original locations), but this time with imputed peptides.

Peptides obtained via Match-between-run (MBR) are accounted for (i.e. are considered as present = non-missing). Thus, make sure that MBR is working as intended (see MBR metrics).

Warning: this metric is meaningless for fractionated data! TODO: compensate for lower scores in large studies (with many Raw files), since peptide FDR is accumulating!?

Heatmap score [EVD: Pep Missing]: Linear scale of the fraction of missing peptides.
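
The first two plots described for this metric can be sketched in Python as follows: the fraction of missing peptides per Raw file relative to all peptides seen in the experiment, and the number of peptides covered by at least X Raw files. Function names, file names and peptide labels are hypothetical, and imputation is not covered.

```python
from collections import Counter


def missing_fractions(peptides_by_file):
    """Per Raw file: fraction of all peptides in the experiment that are missing."""
    all_peptides = set().union(*peptides_by_file.values())
    return {
        name: 1.0 - len(set(peps)) / len(all_peptides)
        for name, peps in peptides_by_file.items()
    }


def covered_by_at_least(peptides_by_file):
    """Map x -> number of peptides observed in at least x Raw files."""
    occurrences = Counter(p for peps in peptides_by_file.values() for p in set(peps))
    n_files = len(peptides_by_file)
    return {
        x: sum(1 for n in occurrences.values() if n >= x)
        for x in range(1, n_files + 1)
    }


# Hypothetical peptide sets (MBR-transferred peptides count as present).
experiment = {
    "file 001": {"A", "B", "C"},
    "file 002": {"A", "B"},
    "file 003": {"A", "D"},
}
missing = missing_fractions(experiment)
coverage = covered_by_at_least(experiment)
print(missing, coverage)
```

Peptides "C" and "D" each occur in a single file only, which shows how rare identifications inflate the missing fraction of every other file as more Raw files are added.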



1.3.21 [experimental] EVD: Non-missing by set


1.3.22 [experimental] EVD: Imputed Peptide Intensity Distribution of Missing Values





















1.3.23 EVD: Peptide ID count

Number of unique (i.e. not counted twice) peptide sequences including modifications (after FDR) per Raw file. A configurable target threshold is indicated as a dashed line.

If MBR was enabled, three categories (‘genuine (exclusive)’, ‘genuine + transferred’, ‘transferred (exclusive)’) are shown, so the user can judge the gain that MBR provides.
Peptides in the ‘genuine + transferred’ category were identified within the Raw file by MS/MS, but at the same time also transferred to this Raw file using MBR. This ID transfer can be correct (e.g. in case of different charge states), or incorrect; see the MBR-related metrics to tell the difference. Ideally, the ‘genuine + transferred’ category should be rather small, and the other two should be large.

If MBR were switched off, you can expect to see the number of peptides corresponding to ‘genuine (exclusive)’ + ‘genuine + transferred’. In general, if the MBR gain is low and the MBR scores are bad (see the two MBR-related metrics), MBR should be switched off for the Raw files which are affected (could be a few or all).

Heatmap score [EVD: Pep Count (>15000)]: Linear scoring from zero. Reaching or exceeding the target threshold gives a score of 100%.
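
The three MBR categories and the linear score can be illustrated with plain set operations in Python. This is a sketch under stated assumptions: the function names, peptide labels and the count of 12000 are hypothetical; only the target of 15000 is taken from the heatmap label above.

```python
def mbr_categories(genuine, transferred):
    """Count peptides per category for one Raw file.

    genuine: peptides identified by MS/MS in this Raw file.
    transferred: peptides transferred to this Raw file via MBR.
    """
    return {
        "genuine (exclusive)": len(genuine - transferred),
        "genuine + transferred": len(genuine & transferred),
        "transferred (exclusive)": len(transferred - genuine),
    }


def count_score(n_ids, target=15000):
    """Linear score from zero; reaching or exceeding the target gives 1.0."""
    return min(1.0, n_ids / target)


categories = mbr_categories(genuine={"A", "B", "C"}, transferred={"C", "D"})
# Expected count with MBR switched off: everything identified by MS/MS.
without_mbr = categories["genuine (exclusive)"] + categories["genuine + transferred"]
with_mbr = sum(categories.values())
score = count_score(12000)
print(categories, without_mbr, with_mbr, score)
```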





1.3.24 EVD: ProteinGroups count

Number of protein groups (after FDR) per Raw file. A configurable target threshold is indicated as a dashed line.

If MBR was enabled, three categories (‘genuine (exclusive)’, ‘genuine + transferred’, ‘transferred (exclusive)’) are shown, so the user can judge the gain that MBR provides. Here, ‘transferred (exclusive)’ means that this protein group has peptide evidence which originates only from transferred peptide IDs. The quantification is (of course) always from the local Raw file. Proteins in the ‘genuine + transferred’ category have peptide evidence from within the Raw file by MS/MS, but at the same time peptide IDs transferred to this Raw file using MBR were also used. It is not unusual to see the ‘genuine + transferred’ category be rather large, since a protein group usually has peptide evidence from both sources. To see if MBR worked, it is better to look at the two MBR-related metrics.

If MBR were switched off, you can expect to see the number of protein groups corresponding to ‘genuine (exclusive)’ + ‘genuine + transferred’. In general, if the MBR gain is low and the MBR scores are bad (see the two MBR-related metrics), MBR should be switched off for the Raw files which are affected (could be a few or all).

Heatmap score [EVD: Prot Count (>3500)]: Linear scoring from zero. Reaching or exceeding the target threshold gives a score of 100%.




